Adjective Density as a Text Formality Characteristic for Automatic Text Classification: A Study Based on the British National Corpus
PACLIC 23 / City University of Hong Kong / 3-5 December 200
Latin Etymologies as Features on BNC Text Categorization
PACLIC 23 / City University of Hong Kong / 3-5 December 200
Gene prioritization of resistant rice gene against Xanthomonas oryzae pv. oryzae by using text mining technologies
To effectively assess the likelihood that an unknown rice protein is resistant to Xanthomonas oryzae pv. oryzae, a hybrid strategy is proposed that enhances gene prioritization by combining text mining technologies with a sequence-based approach. The text mining technique of term frequency-inverse document frequency (TF-IDF) is used to measure the importance of distinguishing terms that reflect biomedical activity in rice; candidate genes are then screened and the vital terms are produced. Afterwards, a built-in classifier based on the chaos game representation algorithm is used to select the best possible candidate gene. Our experimental results show that the combination of these two methods achieves enhanced gene prioritization.
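The term-weighting step mentioned in the abstract can be illustrated with a minimal sketch of TF-IDF. This is not the authors' implementation: the function name `tf_idf`, the toy documents, and the particular variant (relative term frequency multiplied by the natural log of the inverse document frequency, with no smoothing) are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Assumed variant: tf = count / document length,
    idf = ln(number of documents / documents containing the term).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical toy corpus, already tokenized; not data from the paper.
docs = [
    "rice protein resistance blight".split(),
    "rice gene expression".split(),
    "rice blight resistance gene".split(),
]
scores = tf_idf(docs)
# Highest-weighted term of the first document.
top = max(scores[0], key=scores[0].get)
```

In this toy corpus a term that occurs in every document (here "rice") receives a weight of zero, while a term confined to a single document is weighted highest, which is the property that makes TF-IDF useful for picking out distinguishing terms.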